In [ ]:
%%HTML
<style>
.container { width:100% }
</style>
The function findMaximum that is defined below takes four arguments:
f is a function of the form $\texttt{f}: \mathbb{R}^n \rightarrow \mathbb{R}$. It is assumed that the function f is concave and therefore has exactly one global maximum.
gradF is the gradient of the function f.
start is a number or a numpy array of numbers that is used to start the search for a maximum.
eps is a small floating point number. This number controls the precision: if x changes by less than eps (relative to its size) from one iteration to the next, then the algorithm stops.
The function findMaximum returns a triple of values of the form
$$ (x_{max}, \texttt{fx}, \texttt{cnt}) $$
Here $x_{max}$ is the position of the maximum, fx is the value $f(x_{max})$, and cnt is the number of iterations that were performed.
The algorithm computes a sequence $(x_n)_n$ that is defined inductively:
$$ x_0 := \texttt{start}, \qquad x_{n+1} := x_n + \alpha \cdot \nabla f(x_n). $$
The algorithm given below adjusts the learning rate $\alpha$ dynamically: If $f(x_{n+1}) > f(x_n)$, then the learning rate alpha is increased by a factor of $1.2$. Otherwise, the step is discarded and the learning rate is halved. This way, the algorithm tunes the learning rate by itself.
In [ ]:
def findMaximum(f, gradF, start, eps):
    x     = start
    fx    = f(x)
    alpha = 1.0  # learning rate
    cnt   = 0    # number of iterations
    while True:
        cnt += 1
        xOld, fOld = x, fx
        x += alpha * gradF(x)
        fx = f(x)
        print(f'cnt = {cnt}, f({x}) = {fx}')
        print(f'gradient = {gradF(x)}')
        if abs(x - xOld) <= abs(x) * eps:
            return x, fx, cnt
        if fx <= fOld:  # f has not increased, learning rate too high
            alpha *= 0.5
            print(f'decrementing: alpha = {alpha}')
            x, fx = xOld, fOld
            continue
        else:           # f has increased
            alpha *= 1.2
            print(f'incrementing: alpha = {alpha}')
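Before applying findMaximum to a concrete problem, we can run it on a simple test case. The following cell is a minimal sketch for illustration only: the helper functions g and gs are not part of the example above. The concave quadratic $g(x) := -(x-1)^2$ has its unique maximum at $x = 1$, so the returned $x_{max}$ should be very close to $1$, and the printed trace shows the adaptive adjustment of the learning rate.
In [ ]:
def g(x):
    return -(x - 1.0)**2     # concave quadratic with maximum at x = 1

def gs(x):
    return -2.0 * (x - 1.0)  # derivative of g

findMaximum(g, gs, 0.0, 1e-12)  # x_max should be close to 1.0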
In [ ]:
import numpy as np
We will try to find the maximum of the function $$ f(x) := \sin(x) - \frac{x^2}{2} $$
In [ ]:
def f(x):
    return np.sin(x) - x**2 / 2
Let us plot this function.
In [ ]:
import matplotlib.pyplot as plt
import seaborn as sns
In [ ]:
X = np.arange(-0.5, 1.8, 0.01)
Y = f(X)
plt.figure(figsize=(15, 10))
sns.set(style='darkgrid')
plt.title('lambda x: sin(x) - x**2/2')
plt.axvline(x=0.0, c='k')
plt.axhline(y=0.0, c='k')
plt.xlabel('x')
plt.ylabel('y')
plt.xticks(np.arange(-0.5, 1.81, step=0.1))
plt.plot(X, Y, color='b')
Clearly, this function has a maximum somewhere between 0.7 and 0.8. Let us use gradient ascent to find it. In order to do so, we have to provide the derivative of this function. We have $$ \frac{\mathrm{d}f}{\mathrm{d}x} = \cos(x) - x. $$
In [ ]:
def fs(x):
    return np.cos(x) - x
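To make sure the derivative was computed correctly, we can compare fs with a central finite-difference approximation of f at a few sample points. This cell is a small sketch added for illustration; the two values should agree up to the discretization error.
In [ ]:
h  = 1e-6
Xs = np.array([-0.5, 0.0, 0.5, 1.0, 1.5])
approx = (f(Xs + h) - f(Xs - h)) / (2 * h)  # central differences
np.allclose(approx, fs(Xs))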
Let us plot the derivative together with the function.
In [ ]:
X2 = np.arange(0.4, 1.1, 0.01)
Ys = fs(X2)
plt.figure(figsize=(15, 10))
sns.set(style='darkgrid')
plt.title('lambda x: sin(x) - x**2/2 and its derivative')
plt.axvline(x=0.0, c='k')
plt.axhline(y=0.0, c='k')
plt.xlabel('x')
plt.ylabel('y')
plt.xticks(np.arange(-0.5, 1.81, step=0.1))
plt.yticks(np.arange(-0.6, 0.61, step=0.1))
plt.plot(X, Y, color='b')
plt.plot(X2, Ys, color='r')
In [ ]:
x_max, _, _ = findMaximum(f, fs, 0.0, 1e-15)
In [ ]:
x_max
The maximum seems to be at $x \approx 0.739085$. Let's check the derivative at this position.
In [ ]:
fs(x_max)
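As an additional cross-check, and assuming SciPy is available (it is not imported in this notebook), we can compute the root of fs in the interval $[0, 1]$ with a bracketing method and compare it with x_max.
In [ ]:
from scipy.optimize import brentq

x_root = brentq(fs, 0.0, 1.0)  # root of cos(x) - x in [0, 1]
x_root, abs(x_root - x_max)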